Search - WEB MINING

[Industry research] master_thesis

Description: Research on Chinese entity relation extraction in the music domain. The task of entity relation extraction is to extract predefined semantic relations between two or more entities from text. This thesis treats entity relation extraction as a classification problem and focuses on the Chinese music domain. It first builds a Chinese music entity relation corpus, then applies two approaches: an unsupervised method based on sequential pattern mining and a supervised method based on feature extraction. - Dissertation for the Master's Degree in Engineering. Efficient web classification is a key technology for extracting the required information from the mass of online information, and feature selection is an important foundation for text classification. Taking generalized information theory as its theoretical basis, a selection method based on quadratic-entropy mutual information first assesses each feature independently, then analyzes the relationships among features and between features and categories. It selects from the high-dimensional feature space the features that are effective for text classification, reducing the dimensionality of the text feature space and improving classification performance.
Platform: | Size: 1445888 | Author: xz | Hits:
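
Independent per-feature assessment of the kind mentioned in this entry can be sketched as follows. The example scores one term against one category with ordinary Shannon mutual information computed from document counts. This is a common textbook measure used here as an assumption, not necessarily the measure the thesis uses, and the function name and counts are hypothetical.

```python
import math

def mutual_information(n11, n10, n01, n00):
    """Mutual information (bits) between term presence and category membership.

    n11: documents in the category that contain the term
    n10: documents outside the category that contain the term
    n01: documents in the category that lack the term
    n00: documents outside the category that lack the term
    """
    n = n11 + n10 + n01 + n00
    # Each cell is paired with its term marginal and its category marginal.
    cells = [
        (n11, n11 + n10, n11 + n01),
        (n10, n11 + n10, n10 + n00),
        (n01, n01 + n00, n11 + n01),
        (n00, n01 + n00, n10 + n00),
    ]
    score = 0.0
    for joint, term_total, class_total in cells:
        if joint:  # empty cells contribute nothing (0 * log 0 is taken as 0)
            score += (joint / n) * math.log2(joint * n / (term_total * class_total))
    return score
```

Terms would then be ranked by score and only the top-scoring ones kept, which is one way to reduce the dimensionality of the feature space.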

[Windows Develop] WID33_WebMinne

Description: Source code for the ID3 algorithm for web data mining, written in C++ and described as highly efficient.
Platform: | Size: 377856 | Author: lhanxi | Hits:
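
ID3 builds a decision tree by repeatedly splitting on the attribute with the highest information gain. The sketch below shows only that selection step. It is written in Python for brevity, is not taken from the listed package, and its function names and dict-based row format are assumptions.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """Entropy reduction from splitting rows (dicts) on one attribute."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def best_attribute(rows, labels, attributes):
    """ID3 splits on the attribute with the highest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))
```

A full ID3 implementation would call `best_attribute` recursively on each resulting subset until the labels in a subset are all the same or no attributes remain.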

[Other] The-programming-collective-wisdom

Description: Programming Collective Intelligence: Building Smart Web 2.0 Applications (《集体智慧编程》) uses machine learning and computational statistics as its background and explains how to mine and analyze data and resources on the web; how to analyze information such as user experience, marketing, and personal taste and draw useful conclusions; and how to use sophisticated algorithms to obtain, collect, and analyze user data and feedback from websites in order to create new user and business value. The book is rich in content, covering collaborative filtering (for related-product recommendations), cluster analysis (finding similar subsets in large data sets), core search engine technology (crawlers, indexing, query engines, the PageRank algorithm, and so on), optimization algorithms for searching and statistically analyzing massive amounts of information, Bayesian filtering (spam and text filtering), decision trees for prediction and decision modeling, information matching in social networks, and applications of machine learning and artificial intelligence.
Platform: | Size: 28408832 | Author: chenlei | Hits:
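
Of the techniques this entry lists, PageRank is compact enough to sketch. The version below is a minimal power-iteration illustration and is not code from the book. The function name, the dict-of-lists graph format, the damping factor of 0.85, and the fixed iteration count are all assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict mapping page -> list of outgoing links."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages with no outgoing links is spread evenly over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)
        rank = new_rank
    return rank
```

For example, `pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]})` returns scores that sum to 1, with page `c` ranked above page `b` because it receives links from both `a` and `b`.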

[Technology Management] DATA-MINING-WEB-LOG

Description: A MATLAB project and the accompanying MPhil project document are included here.
Platform: | Size: 10240 | Author: karthick5 | Hits:

[Linux-Unix] twitter__util

Description: Source code from the book Mining the Social Web (《社交网站的数据挖掘与分析》): the twitter__util.py script.
Platform: | Size: 2048 | Author: lotus_wu | Hits:

[JSP/Java] test_exam

Description: A self-built, web-based online examination system, with modifications for Zijin Mining to meet its own requirements. It can mark exam papers and score them automatically.
Platform: | Size: 9251840 | Author: tunxunText | Hits:

[Game Hook Crack] mjllq

Description: A mining browser for the web game 名将, offered as a reference for people who develop web games.
Platform: | Size: 19456 | Author: 里申军 | Hits:

[Software Engineering] dawak80

Description: Periodic pattern mining is the problem that regards temporal regularity. There are many emerging applications in periodic pattern mining, including web usage recommendation, weather prediction, computer networks and biological data. In this paper, we propose a Progressive Timelist-Based Verification (PTV) method to the mining of periodic patterns from a sequence of event sets. The parameter min_rep is employed to specify the minimum number of repetitions required for a valid segment of non-disrupted pattern occurrences. We also describe a partitioning approach to handle extra large/long data sequence. The experiments demonstrate good performance and scalability with large frequent patterns.
Platform: | Size: 161792 | Author: Moorthi | Hits:
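
The role of the minimum-repetition parameter can be illustrated with a short sketch. It is not the PTV method from the paper: it only checks one given pattern at one given period and offset, and reports the runs of uninterrupted occurrences that are long enough to count as valid segments. The function name and argument layout are assumptions.

```python
def valid_segments(sequence, pattern, period, offset, min_rep):
    """Return (start_index, repetitions) for each maximal run of uninterrupted
    occurrences of `pattern` at positions offset, offset + period, ...
    that repeats at least `min_rep` times. Event sets are Python sets."""
    segments = []
    start, reps = None, 0
    for pos in range(offset, len(sequence), period):
        if pattern <= sequence[pos]:        # pattern occurs in this event set
            if reps == 0:
                start = pos
            reps += 1
        else:                               # a disruption ends the current run
            if reps >= min_rep:
                segments.append((start, reps))
            reps = 0
    if reps >= min_rep:                     # close a run that reaches the end
        segments.append((start, reps))
    return segments
```

Raising `min_rep` discards short runs, so only patterns that repeat steadily for a while are reported.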

[Other] LGME

Description:
Input:
  param: parameters of the LMGE algorithm
  param.mu, param.alpha, param.beta: regularization parameters
  param.p: dimension of the shared subspace
  param.k: number of nearest neighbors for the Laplacian matrix
  X: input data
  Y: ground-truth labels
Output:
  model.W: matrix W
Reference: Yi Yang, Fei Wu, Feiping Nie, Heng Tao Shen, Yueting Zhuang and Alex Hauptmann. Web and Personal Image Annotation by Mining Label Correlation with Relaxed Visual Graph Embedding.
Contact: yyang@cs.cmu.edu
Platform: | Size: 1024 | Author: Arron | Hits:

[Internet-Network] 1368884419740-

Description: More and more people are keen to build web crawlers (spiders), and more and more applications need them, such as search engines, news gathering, and public opinion monitoring. The techniques (algorithms and strategies) involved are broad and complex: page fetching, link following, page analysis, page search, page ranking, structured and unstructured data extraction, and later finer-grained data mining. A beginner cannot master and apply all of this overnight. This document focuses on six of these approaches.
Platform: | Size: 7168 | Author: 小强 | Hits:
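
As an illustration of the page-analysis step that a crawler needs before it can follow links, the sketch below extracts absolute link URLs from an HTML string using only the Python standard library. It is not taken from the listed document, and the class and function names are assumptions.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkExtractor(HTMLParser):
    """Collects absolute URLs from the href attributes of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page's own URL.
                    self.links.append(urljoin(self.base_url, value))

def extract_links(html, base_url):
    """Return the links found in `html`, in document order."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links
```

A crawler would feed each fetched page through `extract_links` and add any URLs it has not yet seen to its queue of pages to visit.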

[Other] pcs3.0

Description: PCS is a professional system for website traffic statistics, website marketing monitoring, user behavior analysis, site data analysis, and Internet advertising analysis. It provides customers with in-depth, cross-referenced traffic reports. It has its own analysis framework for visitor behavior analysis, online marketing analysis, and site decision support, with the aim of finding what is genuinely valuable behind the data and producing actionable recommendations. Website traffic statistics work by recording visit information such as referral source, visit time, and content viewed, analyzing it systematically, and then summarizing visitors' sources, interest trends...
Platform: | Size: 2101248 | Author: yiyi | Hits:

[JSP/Java] log_mining

Description: Data preprocessing, user identification, session identification, and path completion. Keywords: web data mining, web usage mining, data collection, preprocessing model, multi-granularity behavior data preprocessing.
Platform: | Size: 1807360 | Author: Joe | Hits:
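
User identification and session identification are standard preprocessing steps in web log mining, and a minimal sketch shows what they involve. The code below is written in Python rather than the package's Java and is not taken from the package. It assumes that a user can be approximated by an (IP address, user agent) pair and that a gap of more than 30 minutes starts a new session; both are common heuristics, and the record layout is hypothetical.

```python
from collections import defaultdict

SESSION_TIMEOUT = 30 * 60  # seconds; a common heuristic, assumed here

def identify_sessions(records, timeout=SESSION_TIMEOUT):
    """Group (ip, user_agent, timestamp, url) records into per-user sessions.

    User identification: a user is approximated by the (ip, user_agent) pair.
    Session identification: a gap longer than `timeout` starts a new session.
    """
    by_user = defaultdict(list)
    for ip, agent, ts, url in records:
        by_user[(ip, agent)].append((ts, url))

    sessions = defaultdict(list)
    for user, hits in by_user.items():
        hits.sort()  # order each user's requests by timestamp
        current, last_ts = [], None
        for ts, url in hits:
            if last_ts is not None and ts - last_ts > timeout:
                sessions[user].append(current)
                current = []
            current.append(url)
            last_ts = ts
        sessions[user].append(current)
    return dict(sessions)
```

Path completion, the remaining step named in the entry, would then fill in pages that were served from a cache and so never appeared in the log; it is not covered by this sketch.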

[Algorithm] BucketOrders

Description: Bucket order mining, that is, mining a single bucket order from a set of rankings over multiple objects, has many applications, such as generating document labels, analyzing users' web browsing habits, and ordering fossil sites in time. A bucket order is an ordered partition of a set of objects: the buckets form a total order, while objects within the same bucket have no order among themselves.
Platform: | Size: 2801664 | Author: zhang29 | Hits:
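
The definition of a bucket order translates directly into a small data structure. In the sketch below a bucket order is a list of sets, and a helper counts how many item pairs a full ranking orders in the opposite direction. This shows only the representation and a simple disagreement count, not the mining algorithm in the listed package, and the function names are assumptions.

```python
def compare(bucket_order, x, y):
    """Return -1 if x precedes y, 1 if y precedes x, 0 if they share a bucket."""
    position = {item: i for i, bucket in enumerate(bucket_order) for item in bucket}
    return (position[x] > position[y]) - (position[x] < position[y])

def disagreements(bucket_order, ranking):
    """Count item pairs that `ranking` (a list, first item ranked first) orders
    opposite to the bucket order; pairs inside one bucket never count."""
    count = 0
    for i, x in enumerate(ranking):
        for y in ranking[i + 1:]:
            if compare(bucket_order, x, y) == 1:
                count += 1
    return count
```

A bucket order that fits a collection of rankings well would keep the total of `disagreements` across those rankings small.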

[JSP] tt

Description: Write Your Own Web Crawler (《自己动手写网络爬虫》) covers the key problems in web crawler development and their Java implementations. It mainly covers acquiring and extracting information from the Internet and mining web information. Alongside the basic principles, the book provides concrete code to deepen the reader's understanding; some of the code can even be used directly.
Platform: | Size: 26259456 | Author: alex | Hits:

[JSP/Java] Id3

Description: Java source code for ID3, a decision tree algorithm used in web data mining.
Platform: | Size: 4096 | Author: 李宝极 | Hits:

[File Operate] CMSimple_a5

Description: CMSimple is a free, simple content management system that advertises a simple architecture, small program files, and fast response. It is written in PHP and runs on a variety of platforms. CMSimple needs no database: all page content is stored in a single HTML file named content.htm on the web server. This plain-text design makes installation and backup relatively simple. The administration back end allows only a single user and is used to set the parameters that control page display. Like many content management systems, CMSimple provides an extension framework that lets developers build functionality beyond the original CMSimple. CMSimple is available under four licenses: GPL v3, Affero General Public License v3, Linkware, and a commercial license.
Platform: | Size: 1643520 | Author: 王磊 | Hits:

[Other] p_fangbaidu_kuaso

Description: A Baidu-style search engine. The spider component has three functional modules: link collection, page analysis, and dead-page scanning. It automatically detects page encodings such as GB2312, BIG5, UTF-8, and Unicode; checks file types to avoid collecting non-text files; and can crawl dynamic pages (ASP, PHP, JSP) as well as static pages (HTML, SHTML, XHTML). Crawling can be resumed: if collection stops because of a system or network failure, the system asks at the next start whether to "continue collecting" or "end the task". The task manager can schedule multiple collection tasks, which run one after another. According to the author, the program is a close imitation of Baidu with its own spider for intelligently crawling pages, not one of the free programs online that only imitate the interface. It lists 15 features: 1. web search; 2. trending searches; 3. site directory; 4. paid ranking; 5. intelligent page crawling by the spider; 6. intelligent site ranking by qp value; 7. back-end filtering of illegal keywords; 8. intelligent site classification; 9. one-click deletion of illegal or cheating sites; 10. site submission entry; 11. feedback message board; 12. custom ads to the right of search results; 13. statistics on indexed sites and pages; 14. one-click site indexing; 15. client-side and web-based spider systems.
Platform: | Size: 2680832 | Author: 阿亮 | Hits:
